The Wackamole Approach to Fault Tolerant Networks Demo
Abstract
Maintaining the availability of critical network servers is an important concern for many organizations. Server redundancy is the traditional approach to provide availability in the presence of failures. From the client perspective, a network-accessible service is resolved via a set of public IP addresses specified for this service. Therefore, the continued availability of a service via these IP addresses is a prerequisite for providing uninterrupted service to the client. In order to function correctly, each of the service’s public IP addresses has to be covered by exactly one physical server at any given time. If no physical server covers a public IP address, the clients will not receive any service. On the other hand, if more than one physical server is covering the same IP address at the same time, the network might not function properly and clients may not be served correctly. A sizable market exists for hardware solutions that maintain the availability of IP addresses, usually via a gateway that hides the actual servers behind a smart switch or router in a centralized manner. We present Wackamole [1], a high availability tool for clusters of servers. Wackamole ensures that a server handles the requests that arrive on any of the service’s public IP addresses. Wackamole is a completely distributed software solution based on a provably correct algorithm that negotiates the assignment of IP addresses among the available servers upon detection of faults and recoveries, and provides N-way fail-over, so that any one of a number of servers can cover for any other. Using a simple algorithm that utilizes strong group communication semantics, Wackamole demonstrates the application of group communication to address a critical availability problem at the core of the system, even in the presence of cascading network or server faults and recoveries.
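The core invariant described above, that every public IP address is covered by exactly one server after any fault or recovery, can be illustrated with a small sketch. This is not Wackamole's actual algorithm or code. It assumes that the group communication layer delivers the same membership view to every live server, and that the servers have already agreed on the previous assignment. The function name `assign_addresses` and all parameter names are hypothetical.

```python
from typing import Dict, List


def assign_addresses(
    view: List[str],
    addresses: List[str],
    current: Dict[str, str],
) -> Dict[str, str]:
    """Return a mapping address -> server covering every address exactly once.

    Illustrative sketch only (assumption: every server calls this with
    identical arguments, so all servers compute the same result without
    further negotiation).
    """
    members = sorted(set(view))
    if not members:
        return {}

    # Keep addresses whose current owner is still in the membership view.
    assignment = {
        ip: owner
        for ip, owner in current.items()
        if owner in members and ip in addresses
    }

    # Count how many addresses each live server already covers.
    load = {m: 0 for m in members}
    for owner in assignment.values():
        load[owner] += 1

    # Hand each uncovered address to the least-loaded server.
    # Ties are broken by server name so the choice is deterministic.
    for ip in sorted(addresses):
        if ip not in assignment:
            target = min(members, key=lambda m: (load[m], m))
            assignment[ip] = target
            load[target] += 1

    return assignment


if __name__ == "__main__":
    vips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    before = assign_addresses(["s1", "s2", "s3"], vips, {})
    # Server s2 fails; the survivors recompute from the new view.
    after = assign_addresses(["s1", "s3"], vips, before)
    print(before)
    print(after)
```

Because the result is a dictionary keyed by address, no address can have two owners, and the final loop guarantees that no address is left uncovered while at least one server is alive. Any surviving server can pick up an orphaned address, which mirrors the N-way fail-over property the abstract describes.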
Similar resources
Design of an Active Approach for Detection, Estimation and Short-Circuit Stator Fault Tolerant Control in Induction Motors
Three-phase induction motors have many industrial applications. Consequently, detecting and estimating a fault, and compensating for it so that the faulty motor still meets its predefined goals, are important issues. One of the most common faults in induction motors is a short circuit in the stator winding. In this paper, an active fault-tolerant control system is designed and pres...
Using Sliding Mode Controller and Eligibility Traces for Controlling the Blood Glucose in Diabetic Patients at the Presence of Fault
Some people with diabetes use insulin injection pumps to control their blood glucose level. Sometimes a fault occurs in the sensor or actuator of these pumps. The main objective of this paper is to hold the blood glucose at the desired level and to provide fault-tolerant control of these injection pumps. To this end, the eligibility traces algorithm is combined with the sliding mod...
On Feasibility of Adaptive Level Hardware Evolution for Emergent Fault Tolerant Communication
A permanent physical fault in communication lines usually leads to a failure. This paper studies the feasibility of evolving a self-organized communication scheme to overcome this problem. In this case, a communication protocol may emerge between blocks and adapt itself to environmental changes such as physical faults and defects. In spite of faults, blocks may continue to function since ...
An Intelligent Fault Diagnosis Approach for Gears and Bearings Based on Wavelet Transform as a Preprocessor and Artificial Neural Networks
In this paper, a fault diagnosis system based on the discrete wavelet transform (DWT) and artificial neural networks (ANNs) is designed to diagnose different types of faults in gears and bearings. DWT is an advanced signal-processing technique for fault detection and identification. Five features of wavelet transform: RMS, crest factor, kurtosis, standard deviation and skewness of discrete wavelet co...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code combining techniques are very effective, and turbo codes are one such code. The algorithm-based fault tolerance techniques that, to detect errors, rely on the c...